Goto

Collaborating Authors

 batch bayesian optimization


e8f2779682fd11fa2067beffc27a9192-Supplemental.pdf

Neural Information Processing Systems

In this analysis, we assume that evaluating the GP prior mean and kernel functions (and the corresponding derivatives) takesO(1)time. For each fantasy model, we need to compute the posterior mean and covariance matrix for the L points (x,w1:L), on which we draw the sample paths. This results in a total cost ofO(KML2)to generate all samples. The SAA approach trades a stochastic optimization problem with a deterministic approximation, which can be efficiently optimized. Suppose that we are interested in the optimization problemminxEω[h(x,ω)].


The Parallel Knowledge Gradient Method for Batch Bayesian Optimization

Neural Information Processing Systems

In many applications of black-box optimization, one can evaluate multiple points simultaneously, e.g. when evaluating the performances of several different neural network architectures in a parallel computing environment. In this paper, we develop a novel batch Bayesian optimization algorithm --- the parallel knowledge gradient method. By construction, this method provides the one-step Bayes optimal batch of points to sample. We provide an efficient strategy for computing this Bayes-optimal batch of points, and we demonstrate that the parallel knowledge gradient method finds global optima significantly faster than previous batch Bayesian optimization algorithms on both synthetic test functions and when tuning hyperparameters of practical machine learning algorithms, especially when function evaluations are noisy.


Batch Bayesian Optimization for High-Dimensional Experimental Design: Simulation and Visualization

Mia, Imon, Tiihonen, Armi, Ernst, Anna, Srivastava, Anusha, Buonassisi, Tonio, Vandenberghe, William, Hsu, Julia W. P.

arXiv.org Machine Learning

Bayesian Optimization (BO) is increasingly used to guide experimental optimization tasks. To elucidate BO behavior in noisy and high-dimensional settings typical for materials science applications, we perform batch BO of two six-dimensional test functions: an Ackley function representing a needle-in-a-haystack problem and a Hartmann function representing a problem with a false maximum with a value close to the global maximum. We show learning curves, performance metrics, and visualization to effectively track the evolution of optimization in high dimensions and evaluate how they are affected by noise, batch-picking method, choice of acquisition function,and its exploration hyperparameter values. We find that the effects of noise depend on the problem landscape; therefore, prior knowledge of the domain structure and noise level is needed when designing BO. The Ackley function optimization is significantly degraded by noise with a complete loss of ground truth resemblance when noise equals 10 % of the maximum objective value. For the Hartmann function, even in the absence of noise, a significant fraction of the initial samplings identify the false maximum instead of the ground truth maximum as the optimum of the function; with increasing noise, BO remains effective, albeit with increasing probability of landing on the false maximum. This study systematically highlights the critical issues when setting up BO and choosing synthetic data to test experimental design. The results and methodology will facilitate wider utilization of BO in guiding experiments, specifically in high-dimensional settings.


Reviews: The Parallel Knowledge Gradient Method for Batch Bayesian Optimization

Neural Information Processing Systems

The paper is well written and easy to follow. Parallelization of BO is an important subject for practical hyperparameter optimization and the proposed approach is interesting and more elegant than most existing approaches I am aware of. The fact a Bayes-optimal batch is determined is very promising. The authors assume independent normally distributed errors, which is common in most BO methods based on Gaussian processes. However, in hyperparameter optimization this assumption is problematic, since measurements errors represent the difference between generalization performance and empirical estimates (e.g., through cross-validation).


Asynchronous Batch Bayesian Optimization with Pipelining Evaluations for Experimental Resource$\unicode{x2013}$constrained Conditions

Taguchi, Yujin, Shibuya, Yusuke, Hiki, Yusuke, Morikura, Takashi, Yamada, Takahiro G., Funahashi, Akira

arXiv.org Artificial Intelligence

Bayesian optimization is efficient even with a small amount of data and is used in engineering and in science, including biology and chemistry. In Bayesian optimization, a parameterized model with an uncertainty is fitted to explain the experimental data, and then the model suggests parameters that would most likely improve the results. Batch Bayesian optimization reduces the processing time of optimization by parallelizing experiments. However, batch Bayesian optimization cannot be applied if the number of parallelized experiments is limited by the cost or scarcity of equipment; in such cases, sequential methods require an unrealistic amount of time. In this study, we developed pipelining Bayesian optimization (PipeBO) to reduce the processing time of optimization even with a limited number of parallel experiments. PipeBO was inspired by the pipelining of central processing unit architecture, which divides computational tasks into multiple processes. PipeBO was designed to achieve experiment parallelization by overlapping various processes of the experiments. PipeBO uses the results of completed experiments to update the parameters of running parallelized experiments. Using the Black-Box Optimization Benchmarking, which consists of 24 benchmark functions, we compared PipeBO with the sequential Bayesian optimization methods. PipeBO reduced the average processing time of optimization to about 56% for the experiments that consisted of two processes or even less for those with more processes for 20 out of the 24 functions. Overall, PipeBO parallelizes Bayesian optimization in the resource-constrained settings so that efficient optimization can be achieved.


Batch Bayesian Optimization on Permutations using the Acquisition Weighted Kernel

Neural Information Processing Systems

In this work we propose a batch Bayesian optimization method for combinatorial problems on permutations, which is well suited for expensive-to-evaluate objectives. We first introduce LAW, an efficient batch acquisition method based on determinantal point processes using the acquisition weighted kernel. Relying on multiple parallel evaluations, LAW enables accelerated search on combinatorial spaces. We then apply the framework to permutation problems, which have so far received little attention in the Bayesian Optimization literature, despite their practical importance. We call this method LAW2ORDER.


Batch Bayesian Optimization via Simulation Matching

Neural Information Processing Systems

Bayesian optimization methods are often used to optimize unknown functions that are costly to evaluate. Typically, these methods sequentially select inputs to be evaluated one at a time based on a posterior over the unknown function that is updated after each evaluation. There are a number of effective sequential policies for selecting the individual inputs. In many applications, however, it is desirable to perform multiple evaluations in parallel, which requires selecting batches of multiple inputs to evaluate at once. In this paper, we propose a novel approach to batch Bayesian optimization, providing a policy for selecting batches of inputs with the goal of optimizing the function as efficiently as possible.


Batch Bayesian Optimization via Particle Gradient Flows

Crovini, Enrico, Cotter, Simon L., Zygalakis, Konstantinos, Duncan, Andrew B.

arXiv.org Artificial Intelligence

Bayesian Optimisation (BO) methods seek to find global optima of objective functions which are only available as a black-box or are expensive to evaluate. Such methods construct a surrogate model for the objective function, quantifying the uncertainty in that surrogate through Bayesian inference. Objective evaluations are sequentially determined by maximising an acquisition function at each step. However, this ancilliary optimisation problem can be highly non-trivial to solve, due to the non-convexity of the acquisition function, particularly in the case of batch Bayesian optimisation, where multiple points are selected in every step. In this work we reformulate batch BO as an optimisation problem over the space of probability measures. We construct a new acquisition function based on multipoint expected improvement which is convex over the space of probability measures. Practical schemes for solving this `inner' optimisation problem arise naturally as gradient flows of this objective function. We demonstrate the efficacy of this new method on different benchmark functions and compare with state-of-the-art batch BO methods.


Cell Culture: Implementing robotics and artificial intelligence

#artificialintelligence

Most animals, including humans, are made up of different organs and tissues formed by specialized cells that perform specific roles. When tissues degenerate or become damaged, the affected cells have to be replaced so the tissues can keep performing their roles. This regenerative potential exists thanks to populations of stem cells in each tissue, which can divide to produce more stem cells – maintaining a constant pool of stem cells for repair – or differentiate into specialized cells to replace damaged cells. The division and differentiation of stem cells needs to be in balance: if too many stem cells differentiate, the pool of stem cells could become exhausted, but if the stem cells divide uncontrollably this could lead to cancer. However, this balance often fails with age, or due to environmental or genetic reasons.


Batch Bayesian Optimization on Permutations using Acquisition Weighted Kernels

Oh, Changyong, Bondesan, Roberto, Gavves, Efstratios, Welling, Max

arXiv.org Machine Learning

In this work we propose a batch Bayesian optimization method for combinatorial problems on permutations, which is well suited for expensive cost functions on permutations. We introduce LAW, a new efficient batch acquisition method based on the determinantal point process, using an acquisition weighted kernel. Relying on multiple parallel evaluations, LAW accelerates the search for the optimal permutation. We provide a regret analysis for our method to gain insight in its theoretical properties. We then apply the framework to permutation problems, which have so far received little attention in the Bayesian Optimization literature, despite their practical importance. We call this method LAW2ORDER. We evaluate the method on several standard combinatorial problems involving permutations such as quadratic assignment, flowshop scheduling and the traveling salesman, as well as on a structure learning task.